A crucial issue of current text generation models is that they often uncontrollably generate factually inconsistent text with respective of their inputs. Limited by the lack of annotated data, existing works in evaluating factual consistency directly transfer the reasoning ability of models trained on other data-rich upstream tasks like question answering (QA) and natural language inference (NLI) without any further adaptation. As a result, they perform poorly on the real generated text and are biased heavily by their single-source upstream tasks. To alleviate this problem, we propose a weakly supervised framework that aggregates multiple resources to train a precise and efficient factual metric, namely WeCheck. WeCheck first utilizes a generative model to accurately label a real generated sample by aggregating its weak labels, which are inferred from multiple resources. Then, we train the target metric model with the weak supervision while taking noises into consideration. Comprehensive experiments on a variety of tasks demonstrate the strong performance of WeCheck, which achieves a 3.4\% absolute improvement over previous state-of-the-art methods on TRUE benchmark on average.
translated by 谷歌翻译
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
translated by 谷歌翻译
建模多代理系统需要了解代理的相互作用。这样的系统通常很难建模,因为它们可以涉及各种类型的相互作用,以促进丰富的社会行为动态。在这里,我们介绍了一种用于准确建模多代理系统的方法。我们介绍了使用多重注意(IMMA)的相互作用建模,这是一种前向预测模型,该模型使用多重潜在图代表多种独立类型的相互作用,并注意对不同优势的关系。我们还介绍了渐进层培训,这是该体系结构的培训策略。我们表明,我们的方法在轨迹预测和关系推理中的最先进模型优于最先进的模型,涵盖了三个多代理方案:社交导航,合作任务成就和团队运动。我们进一步证明,我们的方法可以改善零拍的概括,并使我们能够探究不同的相互作用如何影响代理行为。
translated by 谷歌翻译
具有多核光纤(MCF)无透镜微观镜片的定制光的产生广泛用于生物医学。然而,用于这种应用的计算机生成的全息图(CGHS)通常由迭代算法产生,这需要高计算工作,限制在体内光源刺激和光纤细胞操纵中的高级应用。纤维芯的随机和离散分布对CGHS引起了强烈的空间偏大,因此,非常需要一种能够快速生成MCF的量身定制的CGHS的方法。我们展示了一种新型阶段编码器深神经网络(Coreenet),它可以在近视频速率下为MCF产生精确定制的CGHS。模拟表明,与传统的CGH技术相比,CoreNet可以将计算时间加速两个大小,并增加产生的光场的保真度。首次,实时生成的定制CGHS在飞行中加载到仅相位的SLM,用于通过MCF微内窥镜在实验中产生动态光场。这铺设了实时细胞旋转的途径和几种需要在生物医学中实时高保真光传递的几种进一步的应用。
translated by 谷歌翻译
Recent advances in operator learning theory have improved our knowledge about learning maps between infinite dimensional spaces. However, for large-scale engineering problems such as concurrent multiscale simulation for mechanical properties, the training cost for the current operator learning methods is very high. The article presents a thorough analysis on the mathematical underpinnings of the operator learning paradigm and proposes a kernel learning method that maps between function spaces. We first provide a survey of modern kernel and operator learning theory, as well as discuss recent results and open problems. From there, the article presents an algorithm to how we can analytically approximate the piecewise constant functions on R for operator learning. This implies the potential feasibility of success of neural operators on clustered functions. Finally, a k-means clustered domain on the basis of a mechanistic response is considered and the Lippmann-Schwinger equation for micro-mechanical homogenization is solved. The article briefly discusses the mathematics of previous kernel learning methods and some preliminary results with those methods. The proposed kernel operator learning method uses graph kernel networks to come up with a mechanistic reduced order method for multiscale homogenization.
translated by 谷歌翻译
Behavior constrained policy optimization has been demonstrated to be a successful paradigm for tackling Offline Reinforcement Learning. By exploiting historical transitions, a policy is trained to maximize a learned value function while constrained by the behavior policy to avoid a significant distributional shift. In this paper, we propose our closed-form policy improvement operators. We make a novel observation that the behavior constraint naturally motivates the use of first-order Taylor approximation, leading to a linear approximation of the policy objective. Additionally, as practical datasets are usually collected by heterogeneous policies, we model the behavior policies as a Gaussian Mixture and overcome the induced optimization difficulties by leveraging the LogSumExp's lower bound and Jensen's Inequality, giving rise to a closed-form policy improvement operator. We instantiate offline RL algorithms with our novel policy improvement operators and empirically demonstrate their effectiveness over state-of-the-art algorithms on the standard D4RL benchmark.
translated by 谷歌翻译
Universal Image Segmentation is not a new concept. Past attempts to unify image segmentation in the last decades include scene parsing, panoptic segmentation, and, more recently, new panoptic architectures. However, such panoptic architectures do not truly unify image segmentation because they need to be trained individually on the semantic, instance, or panoptic segmentation to achieve the best performance. Ideally, a truly universal framework should be trained only once and achieve SOTA performance across all three image segmentation tasks. To that end, we propose OneFormer, a universal image segmentation framework that unifies segmentation with a multi-task train-once design. We first propose a task-conditioned joint training strategy that enables training on ground truths of each domain (semantic, instance, and panoptic segmentation) within a single multi-task training process. Secondly, we introduce a task token to condition our model on the task at hand, making our model task-dynamic to support multi-task training and inference. Thirdly, we propose using a query-text contrastive loss during training to establish better inter-task and inter-class distinctions. Notably, our single OneFormer model outperforms specialized Mask2Former models across all three segmentation tasks on ADE20k, CityScapes, and COCO, despite the latter being trained on each of the three tasks individually with three times the resources. With new ConvNeXt and DiNAT backbones, we observe even more performance improvement. We believe OneFormer is a significant step towards making image segmentation more universal and accessible. To support further research, we open-source our code and models at https://github.com/SHI-Labs/OneFormer
translated by 谷歌翻译
Adaptive mesh refinement (AMR) is necessary for efficient finite element simulations of complex physical phenomenon, as it allocates limited computational budget based on the need for higher or lower resolution, which varies over space and time. We present a novel formulation of AMR as a fully-cooperative Markov game, in which each element is an independent agent who makes refinement and de-refinement choices based on local information. We design a novel deep multi-agent reinforcement learning (MARL) algorithm called Value Decomposition Graph Network (VDGN), which solves the two core challenges that AMR poses for MARL: posthumous credit assignment due to agent creation and deletion, and unstructured observations due to the diversity of mesh geometries. For the first time, we show that MARL enables anticipatory refinement of regions that will encounter complex features at future times, thereby unlocking entirely new regions of the error-cost objective landscape that are inaccessible by traditional methods based on local error estimators. Comprehensive experiments show that VDGN policies significantly outperform error threshold-based policies in global error and cost metrics. We show that learned policies generalize to test problems with physical features, mesh geometries, and longer simulation times that were not seen in training. We also extend VDGN with multi-objective optimization capabilities to find the Pareto front of the tradeoff between cost and error.
translated by 谷歌翻译
对移动障碍的检测和细分,以及对当地环境的未来占用状态的预测,对于自动驾驶汽车,必不可少的自动驾驶行动至关重要。在本文中,我们提出了一个框架,该框架使用深层神经网络体系结构将两个功能集成在一起。我们的方法首先检测到现场移动对象的段,并使用此信息来预测自动驾驶汽车周围环境的时空演化。为了解决静态动态对象分割和环境预测模型直接集成的问题,我们建议在整个框架中使用基于占用的环境表示。我们的方法在现实Waymo打开数据集上进行了验证,并证明了比基线方法更高的预测准确性。
translated by 谷歌翻译
与2D车道相比,实际3D车道数据很难准确收集。在本文中,我们提出了一种仅使用2D车道标签训练3D车道的新方法,称为弱监督的3D车道检测WS-3D车道。通过在相邻车道上的恒定车道宽度和相等高度的假设,我们间接监督训练中的3D车道高度。为了克服数据收集过程中相机音调动态变化的问题,提出了相机音调自校准方法。在锚固表示中,我们提出了一个具有改进的非限量抑制(NMS)方法的双层锚,该方法使基于锚的方法可以预测两条接近的车道线。实验是在两种监督方法下在3D-LANENEN的基础上进行的。在弱监督的环境下,我们的WS-3D车道的表现优于先前的3D-LANEN:APOLLO 3D合成数据集的F得分上升到92.3%,而F1在3DDLANES上上升到74.5%。同时,在纯监督环境中的WS-3D车道可以提高更多的增量,并且优于最先进的设置。据我们所知,WS-3D车道是在弱监督环境下进行3D车道检测的第一次尝试。
translated by 谷歌翻译